Speech coding using mixture of Gaussians polynomial model

Authors

  • Parham Zolfaghari
  • Tony Robinson
Abstract

We have investigated a novel method of spectral estimation based on a mixture of Gaussians in a sinusoidal analysis and synthesis framework. After quantisation of this parametric scheme, a fixed frame-rate coder operating at a bit-rate of around 2.4 kbits/s has been developed. This paper describes an extension to this spectral model in which the parameters of the mixture of Gaussians are constrained to follow a polynomial trajectory over a segment of speech data. This is referred to as the mixture of Gaussians polynomial model (MGPM). To realise a segmental coder, dynamic programming is performed over the utterance. The segmental representation of the spectra yields a log-likelihood score over each segment, which is used as the cost function in the dynamic programming algorithm. Speech coding components such as pitch, voicing and gain are also described segmentally. A number of segmental coders are presented with bit-rates in the range of 350 to 650 bits/s. Evaluated using DRT scoring, these coders offer good and intelligible coded speech at these bit-rates.
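The segmentation step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract uses a segmental log-likelihood from the MGPM as the cost, whereas this sketch substitutes the least-squares residual of a polynomial fit to each parameter track as a stand-in. The function names `segment_cost` and `segment_utterance`, and the `order`, `max_len` and `penalty` parameters, are assumptions introduced only for illustration.

```python
import numpy as np


def segment_cost(frames, order=1):
    """Stand-in segment cost: residual sum of squares after fitting a
    polynomial trajectory to every parameter track in the segment.
    (Assumption: the paper uses an MGPM log-likelihood instead.)"""
    n = len(frames)
    t = np.linspace(0.0, 1.0, n)           # normalised time within the segment
    deg = min(order, n - 1)                # avoid under-determined fits
    coeffs = np.polyfit(t, frames, deg)    # one column of coefficients per track
    fitted = np.vander(t, deg + 1) @ coeffs
    return float(np.sum((frames - fitted) ** 2))


def segment_utterance(frames, order=1, max_len=20, penalty=1.0):
    """Dynamic programming over segment boundaries.

    frames  : array of shape (T, D), one parameter vector per frame
    penalty : hypothetical per-segment cost that discourages over-segmentation
    Returns (list of (start, end) pairs with end exclusive, total cost).
    """
    frames = np.asarray(frames, dtype=float)
    T = len(frames)
    best = np.full(T + 1, np.inf)          # best[e] = min cost of frames[:e]
    best[0] = 0.0
    back = np.zeros(T + 1, dtype=int)      # back[e] = start of last segment
    for end in range(1, T + 1):
        for start in range(max(0, end - max_len), end):
            cost = best[start] + segment_cost(frames[start:end], order) + penalty
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    # Trace the boundaries back from the end of the utterance.
    bounds = []
    end = T
    while end > 0:
        start = int(back[end])
        bounds.append((start, end))
        end = start
    return bounds[::-1], float(best[T])
```

For example, a one-dimensional track made of two linear pieces, `np.concatenate([np.arange(10.0), 20.0 - np.arange(10.0)])[:, None]`, should be split at frame 10 when `order=1` and the penalty is small. The per-segment penalty plays the role of a rate constraint here; how the paper trades segment count against likelihood is not stated in the abstract.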


Similar resources

A hybrid speech recognizer combining HMMs and polynomial classification

In this paper, we present a hybrid speech recognizer combining Hidden Markov Models (HMMs) and a polynomial classifier. In our approach the emission probabilities are not modeled as a mixture of Gaussians but are calculated by the polynomial classifier. However, we do not apply the classifier directly to the feature vector but we make use of the density values of Gaussians clustering the featur...


Advanced Acoustic Modeling with the Hybrid HMM/BN Framework

Most of the current state-of-the-art speech recognition systems are based on HMMs, which usually use a mixture of Gaussian functions as the state probability distribution model. It is common practice to use the EM algorithm for Gaussian mixture parameter learning. In this case, the learning is done in a "blind", data-driven way without taking into account how the speech signal has been produced and whic...


Speech modeling using variational Bayesian mixture of Gaussians

The topic of this paper is speech modeling using the Variational Bayesian Mixture of Gaussians algorithm proposed by Hagai Attias (2000). Several mixtures of Gaussians were trained for representing cepstrum vectors computed from the TIMIT database. The VB-MOG algorithm was compared to the standard EM algorithm. VB-MOG was clearly better, its convergence was faster, there was no tendency to over...


A new look at HMM parameter tying for large vocabulary speech recognition

Most current state-of-the-art large-vocabulary continuous speech recognition (LVCSR) systems are based on state-clustered hidden Markov models (HMMs). Typical systems use thousands of state clusters, each represented by a Gaussian mixture model with a few tens of Gaussians. In this paper, we show that models with far more parameter tying, like phonetically tied mixture (PTM) models, give better...


The bucket box intersection (BBI) algorithm for fast approximative evaluation of diagonal mixture Gaussians

Today, most of the state-of-the-art speech recognizers are based on Hidden Markov modeling. Using semi-continuous or continuous density Hidden Markov Models, the computation of emission probabilities requires the evaluation of mixture Gaussian probability density functions. Since it is very expensive to evaluate all the Gaussians of the mixture density codebook, many recognizers only compute th...




Publication date: 1999